Integrating Prosodic Information into a Speech Recogniser
Abstract
Over the last decade there has been a growing tendency to incorporate language engineering strategies into speech technology. This approach combines linguistic and mathematical information across several applications: machine translation, natural language processing, speech synthesis and automatic speech recognition (ASR). In speech synthesis, this hybrid (linguistic and mathematical/statistical) approach has led to efficient models for reproducing the acoustic features of natural language. The incorporation of language engineering strategies into ASR, however, is only beginning. In this paper, we present a theoretical framework for integrating linguistic information into an ASR system. The objective is to design a model that can detect the suprasegmental features of the speech input, mainly those related to the fundamental frequency (F0), which can clarify the function of pauses, intonation contours, and interruptions. This specification model has been designed in the framework of a dialogue system.
Similar resources
The use of prosody in a combined system for punctuation generation and speech recognition
In this paper, we discuss a combined system for punctuation generation and speech recognition. This system incorporates prosodic information with acoustic and language model information. Experiments are conducted for both the reference transcriptions and speech recogniser outputs. For the reference transcription case, prosodic information is shown to be more useful than language model informati...
A generalised model for utilising prosodic information in continuous speech recognition
Prosodic features in continuous speech provide cues that may be used to resolve syntactic ambiguities and to increase the accuracy of speech recognition/understanding systems. This paper presents a novel method using a multivariate statistical framework for producing a model of the relationship between prosodic and syntactic structures in continuous speech. The model can be used for Lingu...
Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and prosodic information using hidden Markov models and decision trees. Lexical information is obtained from a speech recognizer, and prosodic features are extracted automatically from speech waveforms. We e...
Doctoral Thesis Proposal Automatic Detection and Classification of Prosodic Events
Speech prosody is a valuable carrier of information. Accents and phrase boundaries have been shown to contribute to syntactic disambiguation, semantic, pragmatic and paralinguistic interpretation, and to convey information about topicality, focus, contrast and information status. This thesis will present and evaluate techniques to detect and classify these prosodic events. The acoustic correlat...
Concept-to-speech generation by integrating syntagmatic features into HMM-based speech synthesis
In conventional concept-to-speech (CTS) methods, a common step is predicting abstract prosodic descriptions, such as the locations of accents and phrase boundaries, from the linguistic information provided by the text generation module. But the prediction results always contain errors, and unacceptable prosodic prediction may ruin the synthesized speech. In addition, linguistic information, whi...